A Study of Bilinear Models in Voice Conversion

Authors

  • Victor Popa
  • Jani Nurminen
  • Moncef Gabbouj
Abstract

This paper presents a voice conversion technique based on bilinear models and introduces the concept of contextual modeling. The bilinear approach reformulates the spectral envelope representation, moving from line spectral frequency features to a two-factor parameterization corresponding to speaker identity and phonetic information, the so-called style and content factors. This decomposition offers a flexible representation suitable for voice conversion and facilitates the use of efficient training algorithms based on singular value decomposition. In the contextual approach, (bilinear) models are trained on subsets of the training data selected on the fly at conversion time, depending on the characteristics of the feature vector to be converted. The performance of bilinear models and contextual modeling is evaluated in objective and perceptual tests against the popular GMM-based voice conversion method, for several sizes and different types of training data.
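
The abstract gives no equations, so the following is only a minimal sketch of how a style and content decomposition can be trained with a singular value decomposition. It assumes the asymmetric bilinear form y = A_s b_c (a speaker-specific matrix times a content vector) and parallel, time-aligned training frames; the function names `fit_asymmetric_bilinear` and `convert`, the model dimension `J`, and the synthetic data are hypothetical and are not taken from the paper.

```python
import numpy as np

def fit_asymmetric_bilinear(Y, n_styles, dim, J):
    """Fit y = A_s @ b_c by truncated SVD (assumed asymmetric bilinear form).

    Y has shape (n_styles * dim, n_contents): the aligned feature vectors of
    each speaker are stacked vertically, one column per aligned frame.
    """
    U, s, Vt = np.linalg.svd(Y, full_matrices=False)
    # Keep the J strongest components; fold singular values into the style side.
    A = (U[:, :J] * s[:J]).reshape(n_styles, dim, J)  # one matrix per speaker
    B = Vt[:J, :]                                     # one content column per frame
    return A, B

def convert(A, src, tgt, y):
    """Estimate the content of y under the source style, then render it
    with the target style."""
    b, *_ = np.linalg.lstsq(A[src], y, rcond=None)
    return A[tgt] @ b

# Synthetic stand-in for aligned features (not real speech data).
rng = np.random.default_rng(0)
n_styles, dim, n_contents, J = 2, 10, 50, 4
A_true = rng.normal(size=(n_styles, dim, J))
B_true = rng.normal(size=(J, n_contents))
Y = np.concatenate([A_true[s] @ B_true for s in range(n_styles)], axis=0)

A, B = fit_asymmetric_bilinear(Y, n_styles, dim, J)
y_source = Y[:dim, 7]                     # frame 7 as spoken by speaker 0
y_converted = convert(A, 0, 1, y_source)  # same content in speaker 1's style
y_target = Y[dim:, 7]                     # reference frame from speaker 1
```

Because the synthetic data here are exactly of rank `J`, the converted frame matches the reference frame up to numerical precision; with real line spectral frequency features the fit would only be approximate, and the choice of `J` would be a tuning decision.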

Similar resources

Using Context-based Statistical Models to Promote the Quality of Voice Conversion Systems

This article examines methods for optimizing the performance of GMM-based voice conversion systems, taking the GMM method as the baseline to be improved. In current methods, because a single conversion function is used to convert all speech units, and because of the subsequent spectral smoothing arising from statistical averaging, we will observe quality ...

A novel technique for voice conversion based on style and content decomposition with bilinear models

This paper presents a novel technique for voice conversion by solving a two-factor task using bilinear models. The spectral content of the speech represented as line spectral frequencies is separated into so-called style and content parameterizations using a framework proposed in [1]. This formulation of the voice conversion problem in terms of style and content offers a flexible representation...

Designing a new non-parallel training method for voice conversion that outperforms parallel training

Introduction: Voice mimicking by computers has been one of the most challenging topics in speech processing in recent years. A voice conversion system has two sides. On one side is the source speaker, whose voice is changed to mimic the voice of the target speaker, who is on the other side. Two methods of p...

ON THE STATIONARY PROBABILITY DENSITY FUNCTION OF BILINEAR TIME SERIES MODELS: A NUMERICAL APPROACH

In this paper, we show that the Chapman-Kolmogorov formula could be used as a recursive formula for computing the m-step-ahead conditional density of a Markov bilinear model. The stationary marginal probability density function of the model may be approximated by the m-step-ahead conditional density for sufficiently large m.

A flexible and modular crosslingual voice conversion system

A cross-lingual voice conversion system aims at modifying the timbral structure of recorded sentences from a source speaker, in order to obtain processed sentences which are perceived as the same sentences uttered by a target speaker. This work presents the cross-lingual voice conversion problem as a network of related sub-problems and discusses several techniques for solving each of these sub-pr...

Journal title:
  • J. Signal and Information Processing

Volume 2  Issue 

Pages  -

Publication date 2011